Abstract

We consider two-tier voting systems and try to determine optimal weights for a
fair representation in such systems. A prominent example of such a voting system
is the Council of Ministers of the European Union. Under the assumption of
independence of the voters, the square root law gives a fair distribution of power (based
on the Penrose-Banzhaf power index) and a fair distribution of weights (based on the
concept of the majority deficit), both given in the book by Felsenthal and Machover.
In this paper, special emphasis is given to the case of correlated voters. The cooperative
behaviour of the voters is modeled by suitable adaptations of spin systems known
from statistical physics. Under certain assumptions we are able to compute the optimal
weights as well as the average deviation of the council's vote from the public
vote, which we call the democracy deficit.

Acknowledgement This paper has been presented at the Leverhulme Trust sponsored 
Voting Power in Practice Symposium held at the London School of Economics, 20-22 
March 2011. 



1 Introduction 

In this paper, we consider two-tier voting systems. The first level of such a system usually
consists of the voters in a country or an association of countries. The voters in each
constituency (or member country) are represented by a delegate in the second level voting
system, the council. Delegates in the council are given a voting weight which as a rule
depends on the population of the constituency they represent.

*Werner Kirsch, Fakultät für Mathematik und Informatik, FernUniversität Hagen, D-58095 Hagen,
Germany, werner.kirsch@fernuni-hagen.de

†Jessica Langner, Fakultät für Mathematik und Informatik, FernUniversität Hagen, D-58095 Hagen,
Germany, jessica.langner@fernuni-hagen.de

Examples of such two-tier voting systems are the Council of Ministers of the European
Union, the Electoral College in the USA and the 'Bundesrat', the state chamber of
Germany's parliamentary system. In each case we assume that the representatives vote
according to the majority vote in their respective constituency.

What is a fair voting weight for a delegate in a council? This question arises immediately
in all these examples. It seems self-evident that for a fair voting system the voting
outcome in the council should agree with the result of a popular vote. The 2000 US presidential
election shows that this is not always the case: while Al Gore won the public vote,
the majority in the Electoral College elected George W. Bush as the 43rd president of the
USA. The difference between the voting result in the council and the public vote is called
the 'democracy deficit'.

In fact, it is not hard to see that no voting system for the council can guarantee that
the vote in the council and the public vote agree. In other words, no matter how we choose
the voting weights for the council members, the democracy deficit cannot be zero for all
possible distributions of 'yes'- and 'no'-votes among the voters. Thus, the best one can do
is to minimize the expected democracy deficit, i.e., the expected difference between the vote
in the council and the popular vote. Obviously, the term 'expected' needs a careful interpretation. If
one assumes that all voters cast their votes independently of each other, then one can show
that the expected democracy deficit is minimized if the voting weight of a representative
is chosen proportional to the square root $\sqrt{N_\nu}$ of the population $N_\nu$ of the respective
country (with number $\nu$).

This is (one version of) the celebrated 'square root law' by Penrose (see
[Felsenthal and Machover 1998] and [Penrose 1946]). In this paper, we go beyond the
square root law by dropping the assumption of the voters' independence. We apply two different
schemes to model the correlation between the voters. In our main model we assume
that the voters are influenced by a 'common belief' of the society or, which is technically
the same, by a strong group of opinion makers. We call this system the CBM
(for 'common belief model' or 'collective bias model') (see [Kirsch 2007]). The CBM
can be looked upon as a generalization of a model proposed by Straffin [Straffin 1977] in
connection with the Shapley-Shubik power index (see [Shapley and Shubik 1954]). The
other model we look at takes into account that voters influence each other. It is based on a
model (the Curie-Weiss Model) for ferromagnetic behaviour taken from statistical physics
(see [Kirsch 2007] and cf. [Ellis 1985], [Thompson 1972]).

If we assume that the voters in different countries vote independently of each other,
we can compute the optimal voting weights in terms of the expected margins of the voting
outcome in the countries. For the CBM the optimal weights are proportional to the population
$N_\nu$. We also compute the expected democracy deficit for these models (for large
$N_\nu$).

Under the assumption that the voters influence each other also across country borders
(according to the CBM) we can also compute the expected democracy deficit asymptotically.
It turns out that in this case any voting weight is as good as any other one. In other
words, on an asymptotic scale any distribution of voting weights is close to optimal.



2 The General Model 

We consider a situation where $M$ states (countries, constituencies) form a federation. The
states are labeled by Greek characters, e.g., $\nu, \kappa, \ldots$. The number of voters (population)
of the state $\nu$ is denoted by $N_\nu$. Consequently, the total population of the union is given
by $N = \sum_{\nu=1}^{M} N_\nu$.

We represent the vote of the voter $i$ in state $\nu$ by $X_{\nu i}$. This voter may vote either 'yes',
in which case we set $X_{\nu i} = 1$, or 'no', encoded as $X_{\nu i} = -1$. Consequently, the result
of a simple majority voting in the state $\nu$ is represented by the sum $S_\nu = \sum_{i=1}^{N_\nu} X_{\nu i}$. A
voting in that state is affirmative if $S_\nu > 0$. For simplicity of notation and to avoid
insignificant technicalities we assume that all $N_\nu$ are odd numbers; this excludes a draw
described by $S_\nu = 0$.

We denote the voting decision in the state $\nu$ by $\chi_\nu = \chi_\nu(S_\nu)$, which we set equal to $1$
if $S_\nu > 0$ and equal to $-1$ if $S_\nu < 0$. Thus, the representative of state $\nu$ will vote 'yes' if
$\chi_\nu = 1$ and 'no' if $\chi_\nu = -1$. For later use we note that $\chi_\nu S_\nu = |S_\nu|$.

If we denote the voting weight for state $\nu$ in the council by $g_\nu$, then the voting result in
the council is given by

\[
C = \sum_{\nu=1}^{M} g_\nu \chi_\nu . \tag{1}
\]

This voting result has to be compared with the popular vote given by

\[
P = \sum_{\nu=1}^{M} S_\nu . \tag{2}
\]

We call the absolute value of the difference between $C$ and $P$ the democracy deficit and
denote it by $\Delta$:

\[
\Delta = |C - P| \tag{3}
\]
\[
\phantom{\Delta} = \Big| \sum_{\nu=1}^{M} g_\nu \chi_\nu - \sum_{\nu=1}^{M} S_\nu \Big| . \tag{4}
\]



The democracy deficit $\Delta$ depends explicitly on the voting weights $g_1, \ldots, g_M$. The
voting weights should be chosen in such a way that the democracy deficit is as small as
possible.
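To make definitions (1)-(4) concrete, the following minimal Python sketch evaluates the council vote, the popular vote and the democracy deficit for a single proposal. The function name `democracy_deficit` and the toy data are illustrative assumptions and do not appear in the paper.

```python
from typing import Sequence


def democracy_deficit(votes: Sequence[Sequence[int]], weights: Sequence[float]) -> float:
    """Return |C - P| for one proposal.

    votes[nu] lists the individual votes (+1 or -1) cast in state nu;
    weights[nu] is the council weight g_nu of that state.
    Every state is assumed to have an odd number of voters (no ties).
    """
    council = 0.0  # C = sum over states of g_nu * chi_nu, eq. (1)
    popular = 0    # P = sum over states of S_nu, eq. (2)
    for state_votes, g in zip(votes, weights):
        s = sum(state_votes)       # S_nu: 'yes' votes minus 'no' votes
        chi = 1 if s > 0 else -1   # chi_nu: the delegate follows the majority
        council += g * chi
        popular += s
    return abs(council - popular)  # eq. (3)


# Hypothetical example: three states with 3, 5 and 1 voters.
example_votes = [[1, 1, -1], [-1, -1, -1, 1, 1], [1]]
example_weights = [3.0, 5.0, 1.0]  # weights proportional to population (an assumption)
deficit = democracy_deficit(example_votes, example_weights)
```

In this toy configuration the popular vote is narrowly in favour while the council, dominated by the largest state, votes against, so the deficit is nonzero.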

The voting results $X_{\nu i}$ are the voters' reactions to a particular proposal $\omega$. Hence, the
democracy deficit $\Delta$ depends on the given proposal $\omega$ as well. It is easy to choose the
weights $g_\nu$ such that $\Delta$ vanishes for a given proposal. But our goal is to optimize the
weights in such a way that $\Delta$ is small for most proposals. Thus, we look at the expected
value of $\Delta^2$, denoted by

\[
D := \mathbb{E}(\Delta^2) . \tag{5}
\]

We will call $D$ the expected democracy deficit in the following (instead of the correct but
clumsy 'expected square of the democracy deficit').

By looking at expectation values we regard the proposals as random input to the voting
system. Hence the probability that the next proposal to the system is a particular proposal
$\omega$ is determined by a probability rule. We assume that there is no bias towards certain proposals;
in particular, any proposal and its counterproposal have the same probability.

The voting system reacts in a deterministic (and rational) way to this random input.
The voting results as well as the democracy deficit are therefore (otherwise deterministic)
functions of the random input, the proposal. The voting outcome is a vector in the space
$\Omega = \{-1, 1\}^N$, where $N$ is the total number of voters, and the probability distribution
of the proposals equips $\Omega$ with a probability distribution $\mathbb{P}$ as well: the probability
of a given outcome $(X_1, \ldots, X_N)$ is the probability of all proposals $\omega$ that lead to that
outcome. Since the voters react rationally, they vote $-1$ on the opposite of a proposal they
would favour, and vice versa. Hence the probability distribution $\mathbb{P}$ satisfies

\[
\mathbb{P}(X_1, \ldots, X_N) = \mathbb{P}(-X_1, \ldots, -X_N) . \tag{6}
\]

We call such a measure a voting measure. For any voting measure we have $\mathbb{P}(X_j = 1)
= \mathbb{P}(X_j = -1) = \frac{1}{2}$, but probabilities concerning more than one voter, like
$\mathbb{P}(X_1 = 1 \text{ and } X_2 = 1)$, cannot be computed from the mere assumption that $\mathbb{P}$ is a voting measure.
Such events concern the correlation structure of the measure, which has yet to be fixed
depending on the situation at hand. One possible specification is the assumption that all
voters act independently of each other. This leads to the property that

\[
\mathbb{P}(X_1 = 1 \text{ and } X_2 = 1) = \mathbb{P}(X_1 = 1) \cdot \mathbb{P}(X_2 = 1) = \tfrac{1}{4} .
\]
More generally, under the assumption of independence we have

\[
\mathbb{P}(X_1 = \xi_1, X_2 = \xi_2, \ldots, X_N = \xi_N) = \prod_{i=1}^{N} \mathbb{P}(X_i = \xi_i) = 2^{-N} \tag{7}
\]

for any $\xi_1, \ldots, \xi_N \in \{-1, 1\}$. The voting measure describes the mutual influence of
the voters on each other; mathematically speaking, it describes the correlation structure of
the voting system. The above example describes independent voters, in some sense the
classical case of the theory. An extreme case is given by the measure $\mathbb{P}_u$ with

\[
\mathbb{P}_u(X_1 = 1, X_2 = 1, \ldots, X_N = 1) = \mathbb{P}_u(X_1 = -1, X_2 = -1, \ldots, X_N = -1) = \tfrac{1}{2} . \tag{8}
\]

For this (rather boring) voting measure the only possible outcomes are the unanimous
votes; it represents total (positive) correlation.

If $\mathbb{P}$ is a voting measure, we denote the expectation value with respect to $\mathbb{P}$ by $\mathbb{E}$, as
was already anticipated in (5). Since we assume that the numbers $N_\nu$ are odd, it follows
that $S_\nu \neq 0$. From this we conclude that $\mathbb{E}(\chi_\nu) = 0$ for any voting measure.

3 Optimal Weights for Independent States 

We begin by determining optimal weights under the assumption that voters in different
states are independent. Thus, we assume that the random variables $X_{\nu i}$ and $X_{\kappa j}$ are
independent for $\nu \neq \kappa$.

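We want to minimize the function

\[
D(\gamma_1, \ldots, \gamma_M) = \mathbb{E}\big(\Delta(\gamma_1, \ldots, \gamma_M)^2\big)
= \sum_{\nu,\kappa=1}^{M} \Big( \gamma_\nu \gamma_\kappa \, \mathbb{E}(\chi_\nu \chi_\kappa) - 2 \gamma_\nu \, \mathbb{E}(\chi_\nu S_\kappa) + \mathbb{E}(S_\nu S_\kappa) \Big) . \tag{9}
\]

The function $D(\gamma_1, \ldots, \gamma_M)$ is a measure for the expected democracy deficit for voting
weights $\gamma_1, \ldots, \gamma_M$.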

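By the assumption of independent states we can conclude that

\[
\mathbb{E}(\chi_\nu \chi_\kappa) = \mathbb{E}(\chi_\nu)\,\mathbb{E}(\chi_\kappa) = 0 \quad \text{for } \nu \neq \kappa, \tag{10}
\]
\[
\mathbb{E}(\chi_\nu S_\kappa) = \mathbb{E}(\chi_\nu)\,\mathbb{E}(S_\kappa) = 0 \quad \text{for } \nu \neq \kappa, \tag{11}
\]

and

\[
\mathbb{E}(S_\nu S_\kappa) = \mathbb{E}(S_\nu)\,\mathbb{E}(S_\kappa) = 0 \quad \text{for } \nu \neq \kappa. \tag{12}
\]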
Moreover, we have $\chi_\nu^2 = 1$ and $\chi_\nu S_\nu = |S_\nu|$, thus

\[
D(\gamma_1, \ldots, \gamma_M) = \sum_{\nu=1}^{M} \Big( \gamma_\nu^2 - 2 \gamma_\nu \, \mathbb{E}(|S_\nu|) + \mathbb{E}(S_\nu^2) \Big) . \tag{13}
\]

It is not hard to find the minimizing weights $g_\nu$ (by the usual procedure: find the zeros of
the derivative). In fact, the weights $g_1, \ldots, g_M$ which minimize the function $D$ are given
by

\[
g_\nu = \mathbb{E}(|S_\nu|) . \tag{14}
\]

This result has a very intuitive interpretation. The quantity $S_\nu$ is the difference between
the 'yes'-votes and the 'no'-votes, so $|S_\nu|$ describes the margin of the voting outcome, i.e.,
the surplus of votes of the winning party. Therefore, the optimal weight $g_\nu$ for the state
$\nu$ is given by the expected margin of a vote in that state. In fact, the delegate of state $\nu$
does not represent the opinion of all voters in this state, but only of those who agree with the
majority; he or she acts against the will of the minority, so as a net result the delegate just
represents the margin.
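The minimization behind (14) can be spelled out in two lines. The worked step below is a sketch using only the terms of (13); completing the square is our choice of presentation and is not taken from the paper.

```latex
\[
\frac{\partial D}{\partial \gamma_\nu}
  = 2\gamma_\nu - 2\,\mathbb{E}(|S_\nu|) = 0
  \quad\Longrightarrow\quad
  \gamma_\nu = \mathbb{E}(|S_\nu|),
  \qquad
  \frac{\partial^2 D}{\partial \gamma_\nu^2} = 2 > 0 .
\]
\[
\gamma_\nu^2 - 2\gamma_\nu\,\mathbb{E}(|S_\nu|) + \mathbb{E}(S_\nu^2)
  = \big(\gamma_\nu - \mathbb{E}(|S_\nu|)\big)^2
  + \mathbb{E}(S_\nu^2) - \mathbb{E}(|S_\nu|)^2 .
\]
```

Each summand of (13) is a parabola in its own weight $\gamma_\nu$, so the states can be optimized separately, and the remainder after completing the square is the variance of $|S_\nu|$.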

We can also compute the expected democracy deficit $D$ for the optimal weights $g_1, \ldots, g_M$:

\[
D(g_1, \ldots, g_M) = \sum_{\nu=1}^{M} \Big( \mathbb{E}(S_\nu^2) - \mathbb{E}(|S_\nu|)^2 \Big) = \sum_{\nu=1}^{M} \mathbb{V}(|S_\nu|) , \tag{15}
\]

where $\mathbb{V}(|S_\nu|)$ denotes the variance of the random quantity $|S_\nu|$.

We emphasize that we did not yet make assumptions about the correlation structure of 
voters inside a country. Of course, the numerical evaluation of the optimal weights and 
minimal democracy deficit requires further assumptions on the correlation between voters. 
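As a sanity check of (13)-(15), the following Python sketch enumerates all voting configurations of a tiny federation. It assumes, purely for illustration, that voters are independent and fair inside each state as well (the special case treated in the next section); the state sizes and function names are hypothetical.

```python
from itertools import product


def expected_deficit(sizes, weights):
    """E(Delta^2) under independent fair voters, by exhaustive enumeration."""
    n_total = sum(sizes)
    total = 0.0
    for config in product((-1, 1), repeat=n_total):
        council, popular, start = 0.0, 0, 0
        for n, g in zip(sizes, weights):
            s = sum(config[start:start + n])     # margin S_nu of this state
            council += g * (1 if s > 0 else -1)  # delegate votes with the majority
            popular += s
            start += n
        total += (council - popular) ** 2
    return total / 2 ** n_total                  # all configurations are equally likely


def margin_moments(n):
    """Return (E|S|, E(S^2)) for n independent fair voters."""
    margins = [abs(sum(c)) for c in product((-1, 1), repeat=n)]
    mean_abs = sum(margins) / len(margins)
    mean_sq = sum(m * m for m in margins) / len(margins)
    return mean_abs, mean_sq


sizes = [1, 3, 5]                                # hypothetical odd state sizes
moments = [margin_moments(n) for n in sizes]
optimal = [mean_abs for mean_abs, _ in moments]  # g_nu = E|S_nu|, eq. (14)
d_min = expected_deficit(sizes, optimal)
variance_sum = sum(sq - m * m for m, sq in moments)  # right-hand side of (15)
d_proportional = expected_deficit(sizes, [float(n) for n in sizes])
```

Under these assumptions `d_min` agrees with `variance_sum`, and weights proportional to population give a strictly larger expected deficit.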



4 Independent Voters 

In this section we assume that all voters act independently of each other, in mathematical
terms: all random variables $X_{\nu i}$ are independent of each other. Under this assumption we
can compute the optimal weight $g_\nu = \mathbb{E}(|S_\nu|)$ as well as the minimal expected democracy
deficit.

For the independent random variables $X_{\nu i}$ we have the central limit theorem, namely
the normalized sums

\[
\frac{1}{\sqrt{N_\nu}}\, S_\nu = \frac{1}{\sqrt{N_\nu}} \sum_{i=1}^{N_\nu} X_{\nu i} \tag{16}
\]

are asymptotically distributed for large $N_\nu$ according to a standard normal distribution (cf.
Lamperti [Lamperti 1996]). From this it follows that for large $N_\nu$

\[
\mathbb{E}(|S_\nu|) \approx \sqrt{\frac{2}{\pi}}\, \sqrt{N_\nu} , \tag{17}
\]
\[
\mathbb{E}(|S_\nu|^2) \approx N_\nu , \tag{18}
\]

and

\[
\mathbb{V}(|S_\nu|) \approx \frac{\pi - 2}{\pi}\, N_\nu . \tag{19}
\]

We conclude that the optimal weight for independent voters is proportional to the square
root of the population. This is exactly the content of the square root law by Penrose (see
[Penrose 1946] and [Felsenthal and Machover 1998]).
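The quality of the approximation (17) can be checked against the exact binomial expectation. The sketch below is only an illustration; the function name `expected_margin` and the sample sizes are our own choices.

```python
from math import comb, pi, sqrt


def expected_margin(n: int) -> float:
    """Exact E|S_n| for n independent fair +1/-1 voters."""
    # k 'yes' votes occur with probability C(n, k) / 2^n and give S = 2k - n.
    return sum(comb(n, k) * abs(2 * k - n) for k in range(n + 1)) / 2 ** n


for n in (11, 101, 1001):
    exact = expected_margin(n)
    approx = sqrt(2 * n / pi)  # right-hand side of (17)
    print(n, round(exact, 4), round(approx, 4))
```

The relative gap between the exact value and $\sqrt{2N_\nu/\pi}$ shrinks as the population grows, which is why weights proportional to $\sqrt{N_\nu}$ are asymptotically optimal for independent voters.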

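The above formulae also allow us to evaluate the minimum of the expected democracy
deficit:

\[
D(g_1, \ldots, g_M) \approx \frac{\pi - 2}{\pi}\, N . \tag{20}
\]

This implies that the expected democracy deficit per voter, namely

\[
\mathbb{E}\Big( \Big( \frac{\Delta}{N} \Big)^2 \Big) \approx \frac{\pi - 2}{\pi}\, \frac{1}{N} , \tag{21}
\]

converges to zero as $N$ becomes large (with convergence rate $\frac{1}{N}$).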
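For a concrete feeling for (20) and (21), the following sketch evaluates the minimal expected democracy deficit exactly for a small, hypothetical federation of independent voters and compares it with the asymptotic formula. The populations are made-up illustrative numbers.

```python
from math import comb, pi


def margin_variance(n: int) -> float:
    """Exact V(|S_n|) = E(S_n^2) - (E|S_n|)^2 for n independent fair voters."""
    mean_abs = sum(comb(n, k) * abs(2 * k - n) for k in range(n + 1)) / 2 ** n
    return n - mean_abs ** 2  # E(S_n^2) = n for independent +1/-1 votes


populations = [101, 301, 501]                          # hypothetical odd populations N_nu
n_total = sum(populations)
d_min = sum(margin_variance(n) for n in populations)   # eq. (15) at the optimal weights
d_asymptotic = (pi - 2) / pi * n_total                 # eq. (20)
per_voter = d_min / n_total ** 2                       # eq. (21)
```

Already for a few hundred voters per state the exact minimum and the asymptotic value agree to within a few percent, and the per-voter deficit is of order $1/N$.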



5 The Collective Bias Model 

Now, we introduce and discuss a model for collective behaviour of voters. The basic idea
is that there is a mainstream opinion, e.g., a common belief due to the country's tradition
or the influence of opinion makers. For a given proposal $\omega$ we model this 'common belief'
by a value $\zeta \in [-1, 1]$ which depends on the proposal at hand. The value $\zeta = 1$ means
there is such a strong common belief in favor of the proposal that all voters will vote 'yes';
$\zeta = -1$ means all voters will vote 'no'. In general, $\zeta$ denotes the expected outcome of
the voting, i.e., $\mathbb{E}_\zeta(X_{\nu i})$. The voting results $X_{\nu i}$ themselves fluctuate around this value
randomly.

Let us be more precise about this. Suppose the voting results are $X_1, \ldots, X_N$ (where
we dropped the index $\nu$ for notational simplicity). Let $\mu$ be a measure on $[-1, 1]$, which
is the distribution of the common belief value $\zeta$, that is, $\mu(]a, b[)$ is the probability that the
value $\zeta$ is between $a$ and $b$. Let $\mathbb{P}_\zeta$ be the probability measure on $\{-1, 1\}$ with

\[
\mathbb{P}_\zeta(X_1 = 1) = p_\zeta = \tfrac{1}{2}(1 + \zeta) ,
\]

so that

\[
\mathbb{E}_\zeta(X_1) := \mathbb{P}_\zeta(X_1 = 1) - \mathbb{P}_\zeta(X_1 = -1) = p_\zeta - (1 - p_\zeta) = \zeta .
\]

For a given value of $\zeta$ we set

\[
\mathbb{P}_\zeta(\xi_1, \ldots, \xi_N) := \prod_{i=1}^{N} \mathbb{P}_\zeta(X_i = \xi_i) . \tag{22}
\]

For any $\zeta \in [-1, 1]$ the expression $\mathbb{P}_\zeta$ is a probability distribution on $\Omega = \{-1, 1\}^N$. We
define the collective bias measure $\mathbb{P}_\mu$ with respect to $\mu$ as

\[
\mathbb{P}_\mu(X_1 = \xi_1, \ldots, X_N = \xi_N) := \int \mathbb{P}_\zeta(\xi_1, \ldots, \xi_N) \, d\mu(\zeta) . \tag{23}
\]

Note that $\mathbb{P}_\zeta$ is not a voting measure (unless $\zeta = 0$). However, $\mathbb{P}_\mu$ is a voting measure
if $\mu$ is invariant under sign change, i.e., $\mu(]a, b[) = \mu(]-b, -a[)$. We call $\mu$ the bias
measure.

If the measure $\mu$ is concentrated at $0$, then $\mathbb{P}_\mu$ makes the voting results $X_i$ independent,
thus we are in the case of Section 4. If $\mu$ is the uniform distribution on $[-1, 1]$ (that is,
every point is equally likely), then the corresponding measure was already considered by
Straffin [Straffin 1977], who established an intimate connection of this model to the
Shapley-Shubik index. In a similar way, the Penrose-Banzhaf measure is connected with
the model of independent voters.
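The definition of the collective bias measure can be made concrete for a bias measure supported on finitely many points, where the integral becomes a finite sum. The sketch below is an illustration under that assumption; the three-point measure `mu` and the function name are hypothetical.

```python
from itertools import product


def cbm_probability(config, bias_measure):
    """P_mu(X = config) for a discrete bias measure given as {zeta: weight}."""
    total = 0.0
    for zeta, weight in bias_measure.items():
        p_yes = 0.5 * (1.0 + zeta)  # P_zeta(X_i = 1)
        prob = 1.0
        for x in config:            # voters are independent once zeta is fixed
            prob *= p_yes if x == 1 else 1.0 - p_yes
        total += weight * prob      # average over the bias measure
    return total


# Hypothetical symmetric bias measure on three points.
mu = {-0.6: 0.25, 0.0: 0.5, 0.6: 0.25}
configs = list(product((-1, 1), repeat=3))
probs = {c: cbm_probability(c, mu) for c in configs}

# Probability that the first two voters both vote 'yes'.
p_both_yes = sum(p for c, p in probs.items() if c[0] == 1 and c[1] == 1)
```

Because `mu` is symmetric, every configuration has the same probability as its negation, so the result is a voting measure. The value of `p_both_yes` exceeds $\frac{1}{4}$, which shows that the voters are positively correlated even though they are independent for each fixed $\zeta$.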

The CBM can be looked upon as a model for spins in statistical mechanics. There
the voters are replaced with elementary magnets (spins) which can be directed upwards
($X_i = 1$) or downwards ($X_i = -1$). In this language the Collective Bias Model describes
spins which do not interact with each other but are influenced by an exterior magnetic
field, namely the collective bias $\zeta$.

In the papers [Kirsch 2007], [Kirsch and Langner 2012] and [Langner 2012] we also
investigate another model for collective voting behaviour which comes directly from
statistical physics, the Curie-Weiss Model (CWM). In this model the spins (voters) influence
each other through an interaction which makes each spin prefer to be aligned parallel to the
others. For voting this means that voters prefer to agree with the other voters. The Curie-Weiss
Model is a very interesting tool to investigate collective behaviour. However, it is
technically more involved than the other models we discuss. Therefore, we will mention
it only rather briefly and refer to the papers mentioned above for more details.
Let us define

\[
H(X_1, \ldots, X_N) = -\frac{1}{2N} \Big( \sum_{i=1}^{N} X_i \Big)^2 . \tag{24}
\]

This is the energy function for the spin configuration $X_1, \ldots, X_N$. We use this to define
measures

\[
Q_\beta(X_1, \ldots, X_N) = e^{-\beta H(X_1, \ldots, X_N)} , \tag{25}
\]

where $\beta \in \, ]0, \infty[$ is the inverse temperature in statistical physics. As a rule, $Q_\beta$ is not a
probability measure, so we normalize it by dividing by its total mass $Z$ and set

\[
\mathbb{P}_\beta(X_1, \ldots, X_N) = Z^{-1} e^{-\beta H(X_1, \ldots, X_N)} . \tag{26}
\]

This is the Curie-Weiss measure for inverse temperature $\beta$. The parameter $\beta$ measures the
strength of the interaction between the voters. The extreme case $\beta = 0$ corresponds to the
model of independent voters; the other extreme $\beta = \infty$ describes the case of the measure
$\mathbb{P}_u$ defined in (8) for unanimous voting.
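Since the energy depends on the configuration only through the sum of the votes, the Curie-Weiss distribution of the margin can be computed exactly by counting configurations. The sketch below assumes the normalization $H = -\frac{1}{2N}(\sum_i X_i)^2$; the function names are our own and the computation is done in log-space to avoid overflow.

```python
from math import exp, lgamma


def curie_weiss_margin_distribution(n: int, beta: float) -> dict:
    """Distribution of S = sum of X_i under the Curie-Weiss measure.

    Assumes H(X) = -(1/(2n)) * (sum X_i)^2. There are C(n, k) configurations
    with k 'yes' votes, each with margin S = 2k - n and weight exp(-beta * H).
    """
    log_weights = {}
    for k in range(n + 1):
        s = 2 * k - n
        log_binom = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        log_weights[s] = log_binom + beta * s * s / (2.0 * n)
    shift = max(log_weights.values())  # subtract the maximum before exponentiating
    weights = {s: exp(lw - shift) for s, lw in log_weights.items()}
    z = sum(weights.values())          # total mass, up to the common shift
    return {s: w / z for s, w in weights.items()}


def expected_abs_margin(n: int, beta: float) -> float:
    """E_beta|S| for n voters."""
    dist = curie_weiss_margin_distribution(n, beta)
    return sum(abs(s) * p for s, p in dist.items())
```

For `beta = 0.0` this reduces to the binomial case of independent voters, while for very large `beta` almost all mass sits on the two unanimous outcomes, in line with the two extreme cases described above.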



6 Optimal weights for the Collective Bias Model 

Let us now suppose that voters in different countries are independent, but voting inside the
countries follows the CBM with bias measure $\mu$. According to Section 3, in this case the
optimal weights are given by

\[
g_\nu = \mathbb{E}_\mu(|S_\nu|) . \tag{27}
\]

For large $N_\nu$ we have

\[
g_\nu = \mathbb{E}_\mu(|S_\nu|) \approx \mu_1 N_\nu , \tag{28}
\]

where $\mu_1 = \int |\zeta| \, d\mu(\zeta)$ is the first absolute moment of $\mu$. Note that for any probability
measure $\mu$ the quantity $\mu_1$ is nonzero, except for the case $\mu = \delta_0$, where the measure is
concentrated at the point $0$. This means that the optimal weights for a council are proportional
to the population of the respective country if the voters can be described by a CBM. This
also includes the Straffin case ($\mu$ is the uniform distribution), which corresponds to the
Shapley-Shubik power index.

The only exception from proportionality is the case $\mu = \delta_0$ corresponding to independent
voting (the Penrose-Banzhaf case), where the square root law applies.
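The proportionality $g_\nu \approx \mu_1 N_\nu$ can be checked exactly in the Straffin case. The sketch below relies on a standard fact that we add here as an assumption of the example: averaging the binomial distribution over a uniform bias makes the number of 'yes' votes uniformly distributed on $\{0, \ldots, N\}$. The function name is hypothetical.

```python
from fractions import Fraction


def straffin_expected_margin(n: int) -> Fraction:
    """Exact E_mu|S| for n voters when mu is uniform on [-1, 1].

    Each count k of 'yes' votes has probability 1 / (n + 1), and the
    margin for k 'yes' votes is |2k - n|.
    """
    return Fraction(sum(abs(2 * k - n) for k in range(n + 1)), n + 1)


mu_1 = Fraction(1, 2)  # first absolute moment of the uniform measure on [-1, 1]
ratios = {n: straffin_expected_margin(n) / n for n in (11, 101, 1001)}
```

The ratios $\mathbb{E}_\mu(|S_\nu|) / N_\nu$ approach `mu_1` from above as the population grows, so in this case the optimal weight is asymptotically half the population.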

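We mention that there is a 'phase transition' for the Curie-Weiss Model if we vary $\beta$
from $0$ to $\infty$, namely

\[
\mathbb{E}_\beta(|S_\nu|) \approx
\begin{cases}
C(\beta)\, \sqrt{N_\nu} & \text{for } \beta < 1; \\
C\, N_\nu^{3/4} & \text{for } \beta = 1; \\
C(\beta)\, N_\nu & \text{for } \beta > 1.
\end{cases}
\tag{29}
\]

For $\beta > 1$ the constant $C(\beta)$ converges to $0$ as $\beta \searrow 1$ and to $1$ as $\beta \nearrow \infty$.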
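The three growth regimes can be observed numerically by estimating the log-log slope of $\mathbb{E}_\beta(|S_\nu|)$ between two population sizes. The sketch below assumes the energy $H = -\frac{1}{2N}(\sum_i X_i)^2$; the population sizes, the sample values of $\beta$ and the function names are illustrative choices, and finite-size effects mean the slopes only approximate the limiting exponents.

```python
from math import exp, lgamma, log


def cw_expected_abs_margin(n: int, beta: float) -> float:
    """E_beta|S| for n voters, assuming H = -(1/(2n)) * (sum X_i)^2."""
    # Log-weight of all configurations with k 'yes' votes (margin 2k - n).
    logs = [
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + beta * (2 * k - n) ** 2 / (2.0 * n)
        for k in range(n + 1)
    ]
    top = max(logs)  # shift to avoid overflow in exp
    weights = [exp(v - top) for v in logs]
    norm = sum(weights)
    return sum(abs(2 * k - n) * w for k, w in enumerate(weights)) / norm


def growth_exponent(beta: float, n_small: int = 1001, n_large: int = 4001) -> float:
    """Log-log slope of E_beta|S| between two odd population sizes."""
    ratio = cw_expected_abs_margin(n_large, beta) / cw_expected_abs_margin(n_small, beta)
    return log(ratio) / log(n_large / n_small)


exponents = {beta: growth_exponent(beta) for beta in (0.5, 1.0, 2.0)}
```

The estimated exponents should lie near $\frac{1}{2}$ below the critical value, near $\frac{3}{4}$ at $\beta = 1$ and near $1$ above it.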

7 Democracy Deficit for the Collective Bias Model 

Given the optimal weights (28) for the CBM (and independent states) we can compute (the
asymptotic behaviour of) the expected democracy deficit $D_\mu$:

\[
D_\mu \approx (\mu_2 - \mu_1^2) \sum_{\nu=1}^{M} N_\nu^2 , \tag{30}
\]

where $\mu_1 = \int |\zeta| \, d\mu(\zeta)$ and $\mu_2 = \int |\zeta|^2 \, d\mu(\zeta)$. Note that $\mu_2 - \mu_1^2 \neq 0$ unless $\mu$ is
concentrated in at most two points. It follows that the expected democracy deficit per voter,
i.e.,

\[
\mathbb{E}_\mu\Big( \Big( \frac{\Delta}{N} \Big)^2 \Big) = \frac{D_\mu}{N^2} ,
\]

converges to a positive constant as the $N_\nu$ tend to infinity (in a uniform way, i.e., $N_\nu =
\alpha_\nu N$).

It is interesting to remark that the expected democracy deficit per voter converges also
to a constant if we choose a non-optimal voting weight, like for instance $g_\nu \sim \sqrt{N_\nu}$ or
$g_\nu = 1$ for all $\nu$. This constant will in general be larger than the one for the optimal
weights, but the order of magnitude of $D$ is not changed.

For the Curie-Weiss Model the expected democracy deficit per voter converges to zero
(for $\beta \neq 1$ even with rate $\frac{1}{N}$).



8 A Model with Global Collective Behaviour

So far we have always assumed that voters in different states act independently. In this
section we consider the case of collective behaviour across country borders. We assume
that all voters act according to the collective bias measure $\mathbb{P}_\mu$. This means there is a
common belief, expressed through the measure $\mu$, for all voters in the union.

Then, the formulae (10)-(13) are no longer valid. In fact, determining the optimal
voting weights requires solving a rather complicated system of $M$ dependent linear equations.
Instead of doing this we look at the democracy deficit directly. It turns out
that for large $N_\nu$ we have for any $\nu, \kappa$

\[
\mathbb{E}_\mu(\chi_\nu \chi_\kappa) \approx 1 , \tag{31}
\]
\[
\mathbb{E}_\mu(\chi_\nu S_\kappa) \approx \mathbb{E}_\mu(|S_\kappa|) \approx \mu_1 N_\kappa , \tag{32}
\]

and

\[
\mathbb{E}_\mu(S_\nu S_\kappa) \approx \mu_2 N_\nu N_\kappa . \tag{33}
\]

Inserting these terms into the expression for $D$ we obtain

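\[
\begin{aligned}
D(g_1, \ldots, g_M)
&= \sum_{\nu,\kappa=1}^{M} \mathbb{E}_\mu(\chi_\nu \chi_\kappa)\, g_\nu g_\kappa
 - 2 \sum_{\nu=1}^{M} g_\nu \sum_{\kappa=1}^{M} \mathbb{E}_\mu(\chi_\nu S_\kappa)
 + \sum_{\nu,\kappa=1}^{M} \mathbb{E}_\mu(S_\nu S_\kappa) \\
&\approx \sum_{\nu,\kappa=1}^{M} g_\nu g_\kappa
 - 2 \sum_{\nu=1}^{M} g_\nu \sum_{\kappa=1}^{M} \mu_1 N_\kappa
 + \sum_{\nu,\kappa=1}^{M} \mu_2 N_\nu N_\kappa \\
&= \Big( \sum_{\nu=1}^{M} g_\nu \Big)^2
 - 2 \mu_1 \Big( \sum_{\nu=1}^{M} g_\nu \Big) N
 + \mu_2 N^2 \\
&= G^2 - 2 \mu_1 N G + \mu_2 N^2 .
\end{aligned}
\tag{34}
\]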

This last expression depends only on the sum $G = \sum_{\nu=1}^{M} g_\nu$ of the voting weights and not
on the single weight $g_\nu$. This means that for large $N_\nu$ the asymptotic value of $D$ does not
depend on the way the weights are distributed among the member states of the union. The
minimal value of $D$ is obtained by choosing $G = \mu_1 N$ independent of the values of the
particular weight $g_\nu$. We also note that the value of $G$ has no real meaning, since we don't
change the voting system at all if we multiply all weights (and the quota) by the same
number $C > 0$.
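The claim about the minimizing value of $G$ follows from completing the square in the asymptotic expression $G^2 - 2\mu_1 N G + \mu_2 N^2$. The short worked step below is our own presentation of that algebra.

```latex
\[
G^2 - 2\mu_1 N G + \mu_2 N^2
  = (G - \mu_1 N)^2 + (\mu_2 - \mu_1^2)\, N^2 .
\]
```

The first term vanishes exactly for $G = \mu_1 N$, and what remains, divided by $N^2$, is the constant $\mu_2 - \mu_1^2$ per voter, whatever the individual weights $g_\nu$ are.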

Finally, we remark that the somewhat hand-waving arguments in (34) need a careful
mathematical interpretation. A precise formulation gives:

\[
\lim_{N \to \infty} \mathbb{E}_\mu\Big( \Big( \frac{\Delta(g_1, \ldots, g_M)}{N} \Big)^2 \Big) = \mu_2 - \mu_1^2 \tag{35}
\]

for $G = \sum_{\nu=1}^{M} g_\nu = \mu_1 N$ and any choice of the individual weights $g_\nu$. This result can be
interpreted in the following way: If there
is a strong common belief in the union across border lines, then it doesn't matter how one
distributes the voting weights in the council.

References 

[Banzhaf 1965] Banzhaf JF (1965) Weighted voting doesn't work: a mathematical analysis.
Rutgers Law Review 19: 317-343.

[Ellis 1985] Ellis R (1985) Entropy, large deviations and statistical mechanics. Grundlehren der
mathematischen Wissenschaften, vol. 271. Berlin Heidelberg New York: Springer.

[Felsenthal and Machover 1995] Felsenthal DS, Machover M (1995) Postulates and paradoxes of
relative voting power - A critical re-appraisal. Theory and Decision 38 (2): 195-229.

[Felsenthal and Machover 1998] Felsenthal DS, Machover M (1998) The measurement of voting
power: Theory and practice, problems and paradoxes. Edward Elgar, Cheltenham, forthcoming.

[Kirsch 2004] Kirsch W (2004) What is a Fair Distribution of Power in the Council of Ministers
of the EU? http://www.ceps.be/Article.php?article_id=360

[Kirsch 2007] Kirsch W (2007) On Penrose's Squareroot Law and Beyond. Homo Oeconomicus
24 (3,4): 357-380.

[Kirsch and Langner 2012] Kirsch W, Langner J (2012) Mathematical Theory of Correlated
Voting, in preparation.

[Lamperti 1996] Lamperti JW (1996) Probability: A Survey of the Mathematical Theory, 2nd
Edition, Wiley.

[Langner 2012] Langner J (2012) PhD-thesis, in preparation.

[Penrose 1946] Penrose LS (1946) The elementary statistics of majority voting. Journal of the
Royal Statistical Society 109: 53-57.

[Shapley and Shubik 1954] Shapley LS, Shubik M (1954) A Method for Evaluating the
Distribution of Power in a Committee System. American Political Science Review 48: 787-792.

[Straffin 1977] Straffin PD (1977) Homogeneity, Independence, and Power Indices. Public Choice
30: 107-118.

[Thompson 1972] Thompson C (1972) Mathematical Statistical Mechanics. Princeton University
Press.






